1. Introduction.

A fundamental problem in numerical analysis is the solution of a
system of linear equations $Ax = b$, where $A$ is an $n \times n$ matrix of
coefficients, $x$ is an $n \times 1$ vector of variables, and $b$ is an $n \times 1$
vector of constants. Efficient methods for solving $Ax = b$, such as
Gaussian and Gauss-Jordan elimination, have long been known. These
methods have been repeatedly rediscovered and applied in other contexts.
For example, Floyd's shortest path algorithm [7], which is based on
Warshall's transitive closure algorithm [32], is a version of Gauss-Jordan
elimination. Kleene's method for converting a finite automaton into
a regular expression [20] is a form of Gauss-Jordan elimination;
Gaussian elimination also solves this problem [3]. In all these
situations the problem of interest can be formulated as the solution
of a system of linear equations defined not over the field of real
numbers but over some other algebra.

In  this  paper  we  provide  a unified  setting  for  such  problems.  Our 
goal  is  to  show  that  a solution  to  one  of  them  can  be  used  to  solve  them 
all.  One  approach  to  this  task  is  to  develop  a minimal  axiom  system  for 
which elimination techniques work (see for instance Aho, Hopcroft, and
Ullman [1] and Lehman [21]) and to show that the problems of interest satisfy
the  axioms.  Our  approach  is  somewhat  different  and  resembles  that  taken 
by  Backhouse  and  Carre  [3];  we  believe  that  the  proper  setting  for  such 
problems  is  the  algebra  of  regular  expressions,  which  is  simple,  well-understood, 
and  general  enough  for  our  purposes. 
2. Regular Expressions and Path Expressions.

Let $\Sigma$ be a finite alphabet containing neither "$\Lambda$" nor "$\emptyset$".
A regular expression over $\Sigma$ is any expression built by applying the
following rules:

(1a) "$\Lambda$" and "$\emptyset$" are atomic regular expressions; for any
$a \in \Sigma$, "$a$" is an atomic regular expression.

(1b) If $R_1$ and $R_2$ are regular expressions, then $(R_1 \cup R_2)$,
$(R_1 \cdot R_2)$, and $(R_1)^*$ are compound regular expressions.

In a regular expression, $\Lambda$ denotes the empty string, $\emptyset$ denotes
the empty set, $\cup$ denotes set union, $\cdot$ denotes concatenation, and
$*$ denotes reflexive, transitive closure (under concatenation).*/ Thus
each regular expression $R$ over $\Sigma$ defines a set $\sigma(R)$ of strings
over $\Sigma$ as follows:

(2a) $\sigma(\Lambda) = \{\Lambda\}$; $\sigma(\emptyset) = \emptyset$; $\sigma(a) = \{a\}$ for $a \in \Sigma$.

(2b) $\sigma(R_1 \cup R_2) = \sigma(R_1) \cup \sigma(R_2) = \{w \mid w \in \sigma(R_1) \text{ or } w \in \sigma(R_2)\}$;

$\sigma(R_1 \cdot R_2) = \sigma(R_1) \cdot \sigma(R_2) = \{w_1 w_2 \mid w_1 \in \sigma(R_1) \text{ and } w_2 \in \sigma(R_2)\}$;

$\sigma(R_1^*) = \bigcup_{k=0}^{\infty} \sigma(R_1)^k$, where $\sigma(R_1)^0 = \{\Lambda\}$ and $\sigma(R_1)^i = \sigma(R_1)^{i-1} \cdot \sigma(R_1)$.

Two regular expressions $R_1$ and $R_2$ are equivalent
if $\sigma(R_1) = \sigma(R_2)$. A regular expression $R$ is simple if $R = \emptyset$ or
$R$ does not contain $\emptyset$ as a subexpression. We can transform any regular
expression $R$ into an equivalent simple regular expression by repeating
the following transformations until none is applicable: (i) replace any
subexpression of the form $\emptyset \cdot R_1$ or $R_1 \cdot \emptyset$ by $\emptyset$; (ii) replace any
subexpression of the form $\emptyset \cup R_1$ or $R_1 \cup \emptyset$ by $R_1$; (iii) replace any
subexpression of the form $\emptyset^*$ by $\Lambda$.

*/ Note that the symbol $\Lambda$ represents both the regular expression "$\Lambda$"
and the empty string. Henceforth we shall avoid using quotation marks
and allow the context to resolve this ambiguity; similarly for $\emptyset$, $\cup$,
$\cdot$, $*$. We shall also freely omit parentheses in regular expressions
when the meaning is clear.
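
As an illustration, transformations (i)-(iii) can be sketched in Python. The class names (`Empty`, `Lam`, `Sym`, `Union`, `Concat`, `Star`) and the function `simplify` are hypothetical and are introduced only for this sketch. One bottom-up pass is enough here, because every subexpression is already simple (either the empty set itself or free of it) by the time its parent is examined.

```python
from dataclasses import dataclass

# Hypothetical syntax tree for regular expressions (names are illustrative).
@dataclass(frozen=True)
class Empty:            # the empty set
    pass

@dataclass(frozen=True)
class Lam:              # the empty string
    pass

@dataclass(frozen=True)
class Sym:              # a single alphabet symbol
    name: str

@dataclass(frozen=True)
class Union:
    left: object
    right: object

@dataclass(frozen=True)
class Concat:
    left: object
    right: object

@dataclass(frozen=True)
class Star:
    body: object

def simplify(r):
    """Return an equivalent simple expression using rules (i)-(iii)."""
    if isinstance(r, Union):
        a, b = simplify(r.left), simplify(r.right)
        if isinstance(a, Empty):        # rule (ii): empty-set union R -> R
            return b
        if isinstance(b, Empty):        # rule (ii): R union empty-set -> R
            return a
        return Union(a, b)
    if isinstance(r, Concat):
        a, b = simplify(r.left), simplify(r.right)
        if isinstance(a, Empty) or isinstance(b, Empty):
            return Empty()              # rule (i)
        return Concat(a, b)
    if isinstance(r, Star):
        a = simplify(r.body)
        return Lam() if isinstance(a, Empty) else Star(a)   # rule (iii)
    return r                            # atomic expressions are unchanged

example = Union(Concat(Sym("a"), Star(Empty())), Concat(Empty(), Sym("b")))
simplified = simplify(example)
```

In this example the right operand of the union collapses to the empty set and disappears, while the starred empty set becomes the empty string.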

A regular expression $R$ is non-redundant if each string in $\sigma(R)$
is represented uniquely in $R$. A more precise definition is as follows:

(3a) $\Lambda$, $\emptyset$, and $a$ for $a \in \Sigma$ are non-redundant.

(3b) Let $R_1$ and $R_2$ be non-redundant.

$R_1 \cup R_2$ is non-redundant if $\sigma(R_1) \cap \sigma(R_2) = \emptyset$.

$R_1 \cdot R_2$ is non-redundant if each $w \in \sigma(R_1 \cdot R_2)$ is uniquely
decomposable into $w = w_1 w_2$ with $w_1 \in \sigma(R_1)$ and
$w_2 \in \sigma(R_2)$.

$R_1^*$ is non-redundant if each non-empty $w \in \sigma(R_1^*)$ is uniquely
decomposable into $w = w_1 w_2 \cdots w_k$ with $w_i \in \sigma(R_1)$
for $1 \le i \le k$.

Note that if $\Lambda \in \sigma(R)$, then $R^*$ is redundant.

Let  G = (V,E)  be  a directed  graph.  We  can  regard  any  path  in  G 
as  a string  over  E , but  not  all  strings  over  E are  paths  in  G . 

A path expression $P$ of type $(v,w)$ is a simple regular expression
over $E$ such that every string in $\sigma(P)$ is a path from $v$ to $w$.
Every subexpression of a path expression is a path expression, whose
type can be determined as follows.

(4) Let $P$ be a path expression of type $(v,w)$.

If $P = P_1 \cup P_2$, then $P_1$ and $P_2$ are path expressions of type
$(v,w)$.

If $P = P_1 \cdot P_2$, there must be a unique vertex $u$ such that $P_1$
is a path expression of type $(v,u)$ and $P_2$ is a path
expression of type $(u,w)$.

If $P = P_1^*$, then $v = w$ and $P_1$ is a path expression of type
$(v,w) = (v,v)$.

It is easy to verify (4) using the fact that $P$ is simple.


3. Shortest Paths.

Let $G = (V,E)$ be a directed graph with an associated real-valued
cost $c(e)$ for each edge $e$. A shortest path from $v$ to $w$ is a
path $p = e_1, e_2, \ldots, e_k$ from $v$ to $w$ such that $\sum_{i=1}^{k} c(e_i)$ is minimum
over all paths from $v$ to $w$. If $G$ contains no cycles of negative
total cost, there is a shortest path from $v$ to $w$ if there is any
path from $v$ to $w$. The single-source shortest path problem is to find,
for each vertex $v$, the cost of a shortest path from $s$ to $v$, where $s$
is a distinguished source vertex. The all-pairs shortest path problem is
to find the cost of a shortest path from $v$ to $w$ for all vertex pairs $v, w$.

We can use path expressions to solve shortest path problems by means
of two mappings, cost and shortest path, defined as follows.

(5a) cost($\Lambda$) $= 0$, shortest path($\Lambda$) $= \Lambda$;
cost($\emptyset$) $= \infty$, shortest path($\emptyset$) $=$ no path;
cost($e$) $= c(e)$, shortest path($e$) $= e$ for $e \in E$.

(5b) cost($P_1 \cup P_2$) $= \min\{$cost($P_1$), cost($P_2$)$\}$,
shortest path($P_1 \cup P_2$) $=$ if cost($P_1$) $\le$ cost($P_2$) then shortest path($P_1$)
else shortest path($P_2$);

cost($P_1 \cdot P_2$) $=$ cost($P_1$) $+$ cost($P_2$),
shortest path($P_1 \cdot P_2$) $=$ shortest path($P_1$) $\cdot$ shortest path($P_2$);

cost($P_1^*$) $=$ if cost($P_1$) $< 0$ then $-\infty$ else $0$,
shortest path($P_1^*$) $=$ if cost($P_1$) $< 0$ then no shortest path else $\Lambda$.
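
The two mappings in (5a) and (5b) can be evaluated by a single recursion over the path expression. The following Python sketch is only an illustration: the tuple encoding of expressions, the function name `cost_and_path`, and the small example graph are assumptions made for this sketch and do not come from the text. Paths are returned as lists of edge names, with `None` standing for "no path" or "no shortest path".

```python
import math

# Hypothetical encoding of path expressions as nested tuples:
#   ("lam",)  ("empty",)  ("edge", name)
#   ("union", P1, P2)  ("cat", P1, P2)  ("star", P1)

def cost_and_path(p, c):
    """Return (cost(P), shortest path(P)) following (5a)-(5b)."""
    tag = p[0]
    if tag == "lam":
        return 0, []
    if tag == "empty":
        return math.inf, None
    if tag == "edge":
        return c[p[1]], [p[1]]
    if tag == "union":
        c1, s1 = cost_and_path(p[1], c)
        c2, s2 = cost_and_path(p[2], c)
        return (c1, s1) if c1 <= c2 else (c2, s2)
    if tag == "cat":
        c1, s1 = cost_and_path(p[1], c)
        c2, s2 = cost_and_path(p[2], c)
        # For simple expressions a missing path here means cost minus infinity.
        if s1 is None or s2 is None:
            return c1 + c2, None
        return c1 + c2, s1 + s2
    if tag == "star":
        c1, _ = cost_and_path(p[1], c)
        return (-math.inf, None) if c1 < 0 else (0, [])
    raise ValueError(tag)

# Assumed example: edges a: s->u, l: u->u, b: u->w, d: s->w.
P_sw = ("union",
        ("cat", ("cat", ("edge", "a"), ("star", ("edge", "l"))), ("edge", "b")),
        ("edge", "d"))

best_cost, best_path = cost_and_path(P_sw, {"a": 2, "l": 1, "b": 3, "d": 6})
neg_cost, neg_path = cost_and_path(P_sw, {"a": 2, "l": -1, "b": 3, "d": 6})
```

With a non-negative loop the starred subexpression contributes cost 0 and the empty path; with a negative loop the cost collapses to minus infinity and no shortest path is reported.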
Lemma 1. Let $P$ be a path expression of type $(v,w)$. If cost($P$) $= \infty$,
there is no path in $\sigma(P)$. If cost($P$) $= -\infty$, there are paths of arbitrarily
small cost in $\sigma(P)$. Otherwise, shortest path($P$) is a minimum cost
path in $\sigma(P)$, and the cost of shortest path($P$) is cost($P$).

Proof. Straightforward by induction on the number of operation symbols
in $P$. □

Theorem 1. Let $P(v,w)$ be a path expression representing all paths
from $v$ to $w$. If cost($P(v,w)$) $= \infty$, there is no path from $v$ to $w$.
If cost($P(v,w)$) $= -\infty$, there are paths of arbitrarily small cost from $v$
to $w$. Otherwise, shortest path($P(v,w)$) is a shortest path from $v$
to $w$; the cost of this path is cost($P(v,w)$).

Proof. Immediate from Lemma 1. □

Theorem 2. Let $P_1(v,w)$ be a path expression such that $\sigma(P_1(v,w))$
contains at least all the simple paths from $v$ to $w$. If there is a
shortest path from $v$ to $w$, shortest path($P_1(v,w)$) gives one such
path; its cost is cost($P_1(v,w)$).

Proof. Any shortest path is simple. □

By applying Theorem 1 we can use a solution to the single-source
(or all-pairs) path expression problem to solve the single-source (or
all-pairs) shortest path problem. By Theorem 2 it is sufficient to
use path expressions representing only the simple paths if we have a
separate test for negative cycles. The following theorem provides such
a test.

Theorem 3. Let $s$ be a distinguished source vertex in $G$. For every
vertex $v$, let $P_1(s,v)$ be a path expression such that $\sigma(P_1(s,v))$
contains at least all the simple paths from $s$ to $v$. Then $G$ contains
a negative cycle if and only if there is some edge $e$ such that
cost($P_1(s,h(e))$) $+\ c(e) <$ cost($P_1(s,t(e))$).

Proof. Straightforward. See Ford and Fulkerson [10]. □
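
The edge test in the theorem above can be sketched as follows. This Python fragment is an illustration under stated assumptions: edges are triples `(h, t, c)` for an edge leading from `h` to `t` with cost `c`, every vertex is assumed reachable from the source, and the simple-path costs are obtained by brute-force enumeration (standing in for path expressions that cover all simple paths), which is practical only for very small graphs. The function names are hypothetical.

```python
import math

def simple_path_costs(vertices, edges, s):
    """Minimum cost over all simple paths from s to each vertex (brute force)."""
    best = {v: math.inf for v in vertices}

    def dfs(v, cost, seen):
        best[v] = min(best[v], cost)
        for (h, t, c) in edges:
            if h == v and t not in seen:
                dfs(t, cost + c, seen | {t})

    dfs(s, 0, {s})
    return best

def has_negative_cycle(vertices, edges, s):
    """Report whether some edge violates best[h] + c >= best[t]."""
    best = simple_path_costs(vertices, edges, s)
    return any(best[h] + c < best[t] for (h, t, c) in edges)

vertices = ["s", "a", "b"]
# The cycle a -> b -> a has total cost -1 in the first graph and +1 in the second.
with_negative = [("s", "a", 1), ("a", "b", -2), ("b", "a", 1)]
without_negative = [("s", "a", 1), ("a", "b", -2), ("b", "a", 3)]

found = has_negative_cycle(vertices, with_negative, "s")
not_found = has_negative_cycle(vertices, without_negative, "s")
```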
4. Systems of Linear Equations.

The next problem to which we shall apply our technique is the
solution of a system $Ax = b$ of linear equations over the set $\mathbb{R}$
of real numbers [11]. This problem has pitfalls not present in the other
problems  we  examine.  The  system  Ax  = b does  not  always  have 
a solution;  even  if  it  does,  the  solution  need  not  be  unique.  Furthermore 
the  standard  algorithms  for  finding  a solution,  such  as  Gaussian  elimination, 
may  not  succeed  even  if  a unique  solution  exists.  (To  deal  with  this 
difficulty,  numerical  analysts  have  devised  more  complicated  algorithms, 
such  as  Gaussian  elimination  with  pivoting  [11].)  We  shall  avoid  these 
issues  by  proposing  a method  that  almost  always  gives  a solution  when 
one  exists. 

We begin by rewriting $Ax = b$ as $-b + (A+I)x = x$, where $I$ is
the $n \times n$ identity matrix. Let $x_0$ be a new variable; then the
system $-b + (A+I)x = x$ is equivalent to

$$x_0 = 1, \qquad x_i = \sum_{j=0}^{n} A'_{ij}\, x_j \quad (1 \le i \le n), \qquad \text{where } A' = \begin{pmatrix} 0 & \mathbf{0} \\ -b & A+I \end{pmatrix}$$

and $\mathbf{0}$ denotes a zero matrix of the appropriate size. Let $G = (V,E)$
be the graph having $n+1$ vertices (one for each variable $x_i$) and $m$
edges (one for each non-zero entry in $A'$) such that there is an
edge $e$ with $h(e) = v_j$ and $t(e) = v_i$ if and only if the entry in
row $i$ and column $j$ of $A'$ is non-zero; let $a(e)$ be the value
of this entry. Then the system of equations takes the form

(6) $x(s) = 1$; $x(v) = \sum \{a(e)\,x(h(e)) \mid e \in E \text{ and } t(e) = v\}$ if $v \ne s$,

where $s = v_0$.
We solve this system by extending the mapping $a$ to regular
expressions over $E$ as follows.

(7a) $a(\Lambda) = 1$; $a(\emptyset) = 0$.

(7b) $a(R_1 \cup R_2) = a(R_1) + a(R_2)$;

$a(R_1 \cdot R_2) = a(R_1)\,a(R_2)$;

$a(R_1^*) = 1/(1 - a(R_1))$.

Note that $a(R_1^*)$ is defined if and only if $a(R_1) \ne 1$. If $R$
is a regular expression over $E$, then $a(R)$ is a rational function of
$a(e_1), a(e_2), \ldots, a(e_m)$ and is defined except on a set of measure zero
in $\mathbb{R}^m$. Note also that the operation of addition into which union is
mapped is not idempotent. This forces us to deal only with non-redundant
regular expressions.
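
To make the mapping concrete, the following Python sketch evaluates (7a) and (7b) with exact rational arithmetic and checks the result against system (6) on a small graph. The tuple encoding, the function name `value`, and the example graph (edges e1: s to v, e2: v to w, e3: w to v, e4: s to w, with arbitrarily chosen values) are assumptions made for illustration only. The two path expressions were written by hand and are non-redundant for this graph.

```python
from fractions import Fraction

def value(r, a):
    """Evaluate the mapping a on a regular expression over the edge set."""
    tag = r[0]
    if tag == "lam":
        return Fraction(1)
    if tag == "empty":
        return Fraction(0)
    if tag == "edge":
        return Fraction(a[r[1]])
    if tag == "union":
        return value(r[1], a) + value(r[2], a)
    if tag == "cat":
        return value(r[1], a) * value(r[2], a)
    if tag == "star":
        inner = value(r[1], a)
        if inner == 1:
            raise ZeroDivisionError("a(R*) is undefined when a(R) = 1")
        return 1 / (1 - inner)
    raise ValueError(tag)

a = {"e1": Fraction(2), "e2": Fraction(3), "e3": Fraction(1, 4), "e4": Fraction(5)}
e1, e2, e3, e4 = (("edge", n) for n in ("e1", "e2", "e3", "e4"))

# All paths from s to v, and from s to w, each represented exactly once.
P_sv = ("cat", ("union", e1, ("cat", e4, e3)), ("star", ("cat", e2, e3)))
P_sw = ("cat", ("union", e4, ("cat", e1, e2)), ("star", ("cat", e3, e2)))

x = {"s": Fraction(1), "v": value(P_sv, a), "w": value(P_sw, a)}
```

For these values the sketch yields x(v) = 13 and x(w) = 44, which satisfy x(v) = a(e1)x(s) + a(e3)x(w) and x(w) = a(e4)x(s) + a(e2)x(v).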

Lemma 2. If $R_1$ and $R_2$ are two equivalent non-redundant regular
expressions over $E$, then $a(R_1) = a(R_2)$ whenever both $a(R_1)$ and
$a(R_2)$ are defined.

Lemma 2 is the hardest result in this paper, and we shall postpone
its proof.

Theorem 4. For each vertex $v$, let $P(s,v)$ be a non-redundant path
expression representing all paths from $s$ to $v$. If $a(P(s,v))$ is
defined for all $v$, then the mapping $x$ defined by $x(v) = a(P(s,v))$
satisfies (6).
Proof. The only path from $s$ to $s$ in $G$ is the empty path; by
Lemma 2, $x(s) = a(P(s,s)) = a(\Lambda) = 1$. If $v \ne s$, then
$\bigcup \{P(s,h(e)) \cdot e \mid e \in E \text{ and } t(e) = v\}$ is a non-redundant regular expression
representing the set of all paths from $s$ to $v$. By Lemma 2,

$$x(v) = a(P(s,v)) = a\Bigl(\bigcup \{P(s,h(e)) \cdot e \mid e \in E \text{ and } t(e) = v\}\Bigr) = \sum \{a(e)\,x(h(e)) \mid e \in E \text{ and } t(e) = v\}. \quad \Box$$

Thus the mapping $a$ almost always gives a solution to (6). It
remains for us to prove Lemma 2. We employ Salomaa's method for showing
the completeness of an axiom system for regular expressions [28]. We
shall use the notation $Q \equiv R$ to denote that $\sigma(Q) = \sigma(R)$ and $a(Q) = a(R)$
wherever both $a(Q)$ and $a(R)$ are defined. A non-redundant regular
expression $Q$ is equationally characterized in terms of non-redundant
regular expressions $Q_1, Q_2, \ldots, Q_q$ if $Q \equiv Q_1$ and

(8) $Q_i \equiv \Bigl(\bigcup_{j=1}^{m} Q_{ij} \cdot e_j\Bigr) \cup D(Q_i)$, where $D(Q_i) \in \{\emptyset, \Lambda\}$ and
$Q_{ij} \in \{Q_k \mid 1 \le k \le q\}$ for all $j$.

Lemma 3. Every non-redundant regular expression over $E$ is equationally
characterized.


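Proof. By induction on the number of operation symbols in the regular
expression.

$$\emptyset \equiv \Bigl(\bigcup_{j=1}^{m} \emptyset \cdot e_j\Bigr) \cup \emptyset, \qquad \Lambda \equiv \Bigl(\bigcup_{j=1}^{m} \emptyset \cdot e_j\Bigr) \cup \Lambda,$$

$$e_j \equiv \emptyset \cdot e_1 \cup \cdots \cup \Lambda \cdot e_j \cup \cdots \cup \emptyset \cdot e_m \cup \emptyset \quad \text{for } 1 \le j \le m.$$

Thus every atomic regular expression is equationally characterized.

Suppose $Q$ and $R$ are equationally characterized. Let $Q_1, \ldots, Q_q$
be non-redundant regular expressions such that $Q \equiv Q_1$ and (8) holds.
Let $R_1, \ldots, R_r$ be non-redundant regular expressions such that $R \equiv R_1$
and (9) holds.

(9) $R_i \equiv \Bigl(\bigcup_{j=1}^{m} R_{ij} \cdot e_j\Bigr) \cup D(R_i)$, where $D(R_i) \in \{\emptyset, \Lambda\}$ and
$R_{ij} \in \{R_k \mid 1 \le k \le r\}$ for all $j$.

We shall equationally characterize $Q \cup R$, $Q \cdot R$, and $Q^*$, assuming they
are non-redundant.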

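Let $1 \le u \le q$, $1 \le v \le r$, and suppose $Q_u \cup R_v$ is non-redundant.
Combining (8) and (9) we obtain

(10) $Q_u \cup R_v \equiv \Bigl(\bigcup_{j=1}^{m} Q_{uj} \cdot e_j\Bigr) \cup D(Q_u) \cup \Bigl(\bigcup_{j=1}^{m} R_{vj} \cdot e_j\Bigr) \cup D(R_v) \equiv \Bigl(\bigcup_{j=1}^{m} (Q_{uj} \cup R_{vj}) \cdot e_j\Bigr) \cup D(Q_u \cup R_v)$,

since if $\sigma(Q_u) \cap \sigma(R_v) = \emptyset$, then $D(Q_u) = \emptyset$ or $D(R_v) = \emptyset$. Furthermore
$Q_{uj} \cup R_{vj}$ is non-redundant for $1 \le j \le m$. Thus if $Q \cup R$ is
non-redundant, the set of equations (10) such that $Q_u \cup R_v$ is
non-redundant equationally characterizes $Q \cup R \equiv Q_1 \cup R_1$.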

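Let $1 \le v \le r$, $s \ge 0$, and $1 \le u_1 < u_2 < \cdots < u_s \le q$.
Suppose $Q \cdot R_v \cup \bigl(\bigcup_{i=1}^{s} Q_{u_i}\bigr)$ is non-redundant. If $D(R_v) = \emptyset$, we
obtain from (8) and (9) that

(11) $Q \cdot R_v \cup \Bigl(\bigcup_{i=1}^{s} Q_{u_i}\Bigr) \equiv \Bigl(\bigcup_{j=1}^{m} (Q \cdot R_{vj}) \cdot e_j\Bigr) \cup \Bigl(\bigcup_{i=1}^{s} Q_{u_i}\Bigr) \equiv \Bigl(\bigcup_{j=1}^{m} \Bigl(Q \cdot R_{vj} \cup \Bigl(\bigcup_{i=1}^{s} Q_{u_i j}\Bigr)\Bigr) \cdot e_j\Bigr) \cup D\Bigl(\bigcup_{i=1}^{s} Q_{u_i}\Bigr)$.

Furthermore $Q \cdot R_{vj} \cup \bigl(\bigcup_{i=1}^{s} Q_{u_i j}\bigr)$ is non-redundant for $1 \le j \le m$. If
$D(R_v) = \Lambda$, we obtain from (8) and (9) that

(12) $Q \cdot R_v \cup \Bigl(\bigcup_{i=1}^{s} Q_{u_i}\Bigr) \equiv \Bigl(\bigcup_{j=1}^{m} (Q \cdot R_{vj}) \cdot e_j\Bigr) \cup Q \cup \Bigl(\bigcup_{i=1}^{s} Q_{u_i}\Bigr) \equiv \Bigl(\bigcup_{j=1}^{m} \Bigl(Q \cdot R_{vj} \cup Q_{1j} \cup \Bigl(\bigcup_{i=1}^{s} Q_{u_i j}\Bigr)\Bigr) \cdot e_j\Bigr) \cup D\Bigl(Q \cdot R_v \cup \bigcup_{i=1}^{s} Q_{u_i}\Bigr)$.

Furthermore $Q \cdot R_{vj} \cup Q_{1j} \cup \bigl(\bigcup_{i=1}^{s} Q_{u_i j}\bigr)$ is non-redundant for $1 \le j \le m$.
It follows that if $Q \cdot R$ is non-redundant, we can equationally characterize
$Q \cdot R \equiv Q \cdot R_1$ in terms of

$$\Bigl\{ Q \cdot R_v \cup \Bigl(\bigcup_{i=1}^{s} Q_{u_i}\Bigr) \Bigm| 1 \le v \le r,\ s \ge 0,\ 1 \le u_1 < u_2 < \cdots < u_s \le q, \text{ and } Q \cdot R_v \cup \Bigl(\bigcup_{i=1}^{s} Q_{u_i}\Bigr) \text{ is non-redundant} \Bigr\}.$$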

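Finally we must consider $Q^*$. Suppose $Q^*$ is non-redundant.
Then $D(Q) = \emptyset$. From (8) we obtain

(13) $Q^* \equiv \Bigl(\bigcup_{j=1}^{m} (Q^* \cdot Q_{1j}) \cdot e_j\Bigr) \cup \Lambda$.

Furthermore $Q^* \cdot Q_{1j}$ is non-redundant for $1 \le j \le m$.
Let $s \ge 1$ and $1 \le u_1 < u_2 < \cdots < u_s \le q$. Suppose $Q^* \cdot \bigl(\bigcup_{i=1}^{s} Q_{u_i}\bigr)$
is non-redundant. If $D(Q_{u_i}) = \emptyset$ for $1 \le i \le s$, then

(14) $Q^* \cdot \Bigl(\bigcup_{i=1}^{s} Q_{u_i}\Bigr) \equiv \Bigl(\bigcup_{j=1}^{m} \Bigl(Q^* \cdot \Bigl(\bigcup_{i=1}^{s} Q_{u_i j}\Bigr)\Bigr) \cdot e_j\Bigr) \cup \emptyset$,

where $Q^* \cdot \bigl(\bigcup_{i=1}^{s} Q_{u_i j}\bigr)$ is non-redundant for $1 \le j \le m$.
If $D(Q_{u_i}) = \Lambda$ for (unique) $i$ such that $1 \le i \le s$,

(15) $Q^* \cdot \Bigl(\bigcup_{i=1}^{s} Q_{u_i}\Bigr) \equiv \Bigl(\bigcup_{j=1}^{m} \Bigl(Q^* \cdot \Bigl(Q_{1j} \cup \Bigl(\bigcup_{i=1}^{s} Q_{u_i j}\Bigr)\Bigr)\Bigr) \cdot e_j\Bigr) \cup \Lambda$,

where $Q^* \cdot \bigl(Q_{1j} \cup \bigl(\bigcup_{i=1}^{s} Q_{u_i j}\bigr)\bigr)$ is non-redundant for $1 \le j \le m$. It
follows that we can equationally characterize $Q^*$ in terms of

$$\{Q^*\} \cup \Bigl\{ Q^* \cdot \Bigl(\bigcup_{i=1}^{s} Q_{u_i}\Bigr) \Bigm| s \ge 1,\ 1 \le u_1 < u_2 < \cdots < u_s \le q, \text{ and } Q^* \cdot \Bigl(\bigcup_{i=1}^{s} Q_{u_i}\Bigr) \text{ is non-redundant} \Bigr\}. \quad \Box$$

We are now ready to prove Lemma 2. We extend $\cup$, $\cdot$, $\equiv$ to ordered
pairs of regular expressions by defining $(Q_1,R_1) \cup (Q_2,R_2) = (Q_1 \cup Q_2, R_1 \cup R_2)$,
$(Q_1,R_1) \cdot (Q_2,R_2) = (Q_1 \cdot Q_2, R_1 \cdot R_2)$, and $(Q_1,R_1) \equiv (Q_2,R_2)$ if and only if
$Q_1 \equiv Q_2$ and $R_1 \equiv R_2$.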


Proof of Lemma 2. Suppose $Q$ and $R$ are non-redundant regular
expressions such that $\sigma(Q) = \sigma(R)$. Let $Q$, $R$ be characterized in
terms of $\{Q_i \mid 1 \le i \le q\}$, $\{R_i \mid 1 \le i \le r\}$ by (8), (9), respectively.
We construct a set $X$ of pairs $(Q_u, R_v)$ such that $\sigma(Q_u) = \sigma(R_v)$.

We begin with $X = \{(Q,R)\}$. We process pairs in $X$ and add new
elements to $X$ until all pairs in $X$ are processed. We process a
pair $(Q_u, R_v)$ as follows. By (8) and (9) we have

$$(Q_u, R_v) \equiv \Bigl(\bigcup_{j=1}^{m} (Q_{uj}, R_{vj}) \cdot (e_j, e_j)\Bigr) \cup (D(Q_u), D(R_v)).$$

Since $\sigma(Q_u) = \sigma(R_v)$, we have $D(Q_u) = D(R_v)$ and $\sigma(Q_{uj}) = \sigma(R_{vj})$ for
$1 \le j \le m$. We add each pair $(Q_{uj}, R_{vj})$ for $1 \le j \le m$ to $X$ if it
is not already present.

We obtain a set of pairs $X = \{(Q^{(1)}, R^{(1)}), \ldots, (Q^{(s)}, R^{(s)})\}$ such
that $s \le qr$, $\sigma(Q^{(i)}) = \sigma(R^{(i)})$ for $1 \le i \le s$, and

$$(Q^{(i)}, R^{(i)}) \equiv \Bigl(\bigcup_{j=1}^{m} (Q^{(i)}_j, R^{(i)}_j) \cdot (e_j, e_j)\Bigr) \cup (D_i, D_i),$$

where each pair $(Q^{(i)}_j, R^{(i)}_j)$ appears in $X$.

Consider the system of equations $x_i = \sum_{j=1}^{m} a(e_j)\,x_{ij} + a(D_i)$,
where $x_{ij} = x_k$ if $(Q^{(i)}_j, R^{(i)}_j) = (Q^{(k)}, R^{(k)})$. This system is satisfied by
$x_i = a(Q^{(i)})$ if $a(Q^{(i)})$ is defined for $1 \le i \le s$ and by $x_i = a(R^{(i)})$
if $a(R^{(i)})$ is defined for $1 \le i \le s$. We can rewrite this system as
$x = Ax + b$, where each entry in $A$ is a linear combination of
$a(e_1), a(e_2), \ldots, a(e_m)$, or equivalently as $(A-I)x = -b$. This system
has a unique solution when the determinant of $A-I$ is non-zero, which
is true except for values of $a(e_1), a(e_2), \ldots, a(e_m)$ forming a set of
measure zero in $\mathbb{R}^m$. Thus $a(Q^{(i)}) = a(R^{(i)})$ for $1 \le i \le s$ except
on a set of measure zero. In particular $a(Q) = a(R)$ except on a set
of measure zero. Since $a(Q)$ and $a(R)$ are rational functions of the
$a(e_j)$'s, $a(Q) = a(R)$ when both are defined. □


5. Continuous Data Flow Problems.

Many  problems  in  global  code  optimization  can  be  formulated  as 
path  problems  of  the  kind  we  are  considering.  The  general  setting  is 
as  follows.  We  represent  a computer  program  by  a flow  graph 
G = (V, E, s)  . Each  vertex  represents  a basic  block  of  the  program 
(a  block  of  consecutive  statements  having  a single  entry  and  a single 
exit).  Each  edge  represents  a possible  transfer  of  control  between 
basic  blocks.  The  start  vertex  s represents  the  start  of  the  program. 

We  are  interested  in  determining,  for  each  basic  block,  facts  which 
must  be  true  on  entry  to  the  block  regardless  of  the  actual  path  of 
program  execution.  Such  facts  can  be  used  for  various  kinds  of  code 
optimization. See Aho and Ullman [2], Hecht [14], and Schaefer [25].

To represent the universe of possible program facts, we use a set
$L$ having a commutative, associative, idempotent meet operation $\wedge$;
such an algebraic structure is called a lower semi-lattice. If $x$ and $y$
are two possible program facts, $x \wedge y$ represents the information common
to both. We can define a relation $\le$ on $L$ by $x \le y$ if and only if
$x \wedge y = x$. The properties of $\wedge$ imply that $\le$ is a partial order
on $L$ [27]; we interpret $x \le y$ to mean that fact $y$ contains more
information than fact $x$. We shall assume that $L$ is complete, by
which we mean that every subset $X \subseteq L$ has a greatest lower bound with
respect to $\le$; we denote this greatest lower bound by $\bigwedge X$. If
$X = \{x_1, x_2, \ldots, x_n\}$, then $\bigwedge X = x_1 \wedge x_2 \wedge \cdots \wedge x_n$. We use $\bot$ to denote
$\bigwedge L$, i.e., the minimum element in $L$. For any functions $f$ and $g$
having common domain and range $L$, we define $f \le g$ if and only if
$f(x) \le g(x)$ for all elements $x$ in the domain of $f$ and $g$.
To represent the effect of the program on the universe of facts,
we associate with each edge $e$ a function $f_e$ such that, if fact $x$
is true on entry to $h(e)$ and control passes through edge $e$, then
$f_e(x)$ will be true on entry to $t(e)$. We can extend these functions
to paths by defining $f_p(x) = x$ if $p$ is the empty path,
$f_p(x) = (f_{e_k} \circ f_{e_{k-1}} \circ \cdots \circ f_{e_1})(x)$ if $p = e_1, e_2, \ldots, e_k$. What we want
to compute is $\bigwedge \{f_p(\bot) \mid p \text{ is a path from } s \text{ to } v\}$ for each vertex $v$.
(We assume the minimum fact $\bot$ is true on entry to the program.)

This  discussion  motivates  the  following  definitions. 

A continuous data flow framework $(L,F)$ is a complete lower semi-lattice
$L$ with meet operation $\wedge$ and a set of functions $F: L \to L$
satisfying the following axioms:

(16a) (identity) $F$ contains the identity function $\iota$.

(16b) (closure) $F$ is closed under meet, function composition, and $*$,
where $(f \wedge g)(x) = f(x) \wedge g(x)$ and $f^*(x) = \bigwedge \{f^i(x) \mid i \ge 0\}$.

(16c) (continuity) For every $f \in F$ and $X \subseteq L$, $f(\bigwedge X) = \bigwedge \{f(x) \mid x \in X\}$.

A continuous data flow problem consists of a flow graph $G = (V,E,s)$,
a continuous data flow framework $(L,F)$, and a mapping from $E$ to $F$;
we use $f_e$ to denote the function associated with edge $e$. The meet
over all paths (MOP) solution to this problem is the mapping mop from
$V$ to $L$ given by mop$(v) = \bigwedge \{f_p(\bot) \mid p \text{ is a path from } s \text{ to } v\}$.

We  can  use  path  expressions  to  solve  continuous  data  flow  problems 
by  means  of  the  mapping  f defined  as  follows. 
(17a) $f(\Lambda) = \iota$;
$f(e) = f_e$.

(17b) $f(P_1 \cup P_2) = f(P_1) \wedge f(P_2)$;
$f(P_1 \cdot P_2) = f(P_2) \circ f(P_1)$;
$f(P_1^*) = f(P_1)^*$.
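
A small Python sketch may help fix ideas about (17a) and (17b). Everything concrete in it is an assumption made for illustration: facts are finite sets of available expressions, meet is set intersection (so the minimum element is the empty set), edge functions are of the gen/kill form built by the hypothetical helper `gen_kill`, and expressions use an ad hoc tuple encoding. The closure is computed by accumulating the meet of successive iterates until it stops changing, which is justified here because the functions distribute over meet and the lattice is finite.

```python
def transfer(p, fe):
    """Return f(P) as a callable on frozensets, following (17a)-(17b)."""
    tag = p[0]
    if tag == "lam":
        return lambda x: x                      # identity function
    if tag == "edge":
        return fe[p[1]]
    if tag == "union":
        f1, f2 = transfer(p[1], fe), transfer(p[2], fe)
        return lambda x: f1(x) & f2(x)          # meet of the two functions
    if tag == "cat":
        f1, f2 = transfer(p[1], fe), transfer(p[2], fe)
        return lambda x: f2(f1(x))              # f(P2) composed with f(P1)
    if tag == "star":
        f1 = transfer(p[1], fe)

        def closure(x):
            acc = cur = x
            while True:
                cur = f1(cur)                   # next iterate of f(P1)
                if acc & cur == acc:            # meet no longer decreases
                    return acc
                acc = acc & cur
        return closure
    raise ValueError(tag)                       # the empty set has no f-value

def gen_kill(gen, kill):
    """Hypothetical edge function: remove killed facts, then add generated ones."""
    gen, kill = frozenset(gen), frozenset(kill)
    return lambda x: (x - kill) | gen

# Assumed flow graph: a: s->u, l: u->u (loop), b: u->w.
fe = {
    "a": gen_kill({"x+y", "p*q"}, ()),
    "l": gen_kill((), {"x+y"}),
    "b": gen_kill({"u-v"}, ()),
}
P_sw = ("cat", ("cat", ("edge", "a"), ("star", ("edge", "l"))), ("edge", "b"))
mop_w = transfer(P_sw, fe)(frozenset())
```

Because the loop may or may not execute, only the facts that survive both possibilities remain available on entry to w.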

Lemma 4. Let $P \ne \emptyset$ be a path expression of type $(v,w)$. Then for
all $x \in L$, $f(P)(x) = \bigwedge \{f_p(x) \mid p \in \sigma(P)\}$.

Proof. By induction on the number of operation symbols in $P$. The
lemma is immediate if $P$ is atomic. Suppose the lemma is true for
path expressions containing fewer than $k$ operation symbols, and let
$P$ contain $k$ operation symbols. We have three cases.

Suppose $P = P_1 \cup P_2$. Then

$$f(P)(x) = f(P_1)(x) \wedge f(P_2)(x) = \Bigl(\bigwedge \{f_p(x) \mid p \in \sigma(P_1)\}\Bigr) \wedge \Bigl(\bigwedge \{f_p(x) \mid p \in \sigma(P_2)\}\Bigr)$$

$$= \bigwedge \{f_p(x) \mid p \in \sigma(P_1) \cup \sigma(P_2)\} = \bigwedge \{f_p(x) \mid p \in \sigma(P)\}.$$

Suppose $P = P_1 \cdot P_2$. Then

$$f(P)(x) = f(P_2)(f(P_1)(x)) = f(P_2)\Bigl(\bigwedge \{f_{p_1}(x) \mid p_1 \in \sigma(P_1)\}\Bigr)$$

$$= \bigwedge \{f(P_2)(f_{p_1}(x)) \mid p_1 \in \sigma(P_1)\} \quad \text{by continuity}$$

$$= \bigwedge \Bigl\{ \bigwedge \{f_{p_1 p_2}(x) \mid p_2 \in \sigma(P_2)\} \Bigm| p_1 \in \sigma(P_1) \Bigr\}$$

$$= \bigwedge \{f_{p_1 p_2}(x) \mid p_1 \in \sigma(P_1) \text{ and } p_2 \in \sigma(P_2)\} = \bigwedge \{f_p(x) \mid p \in \sigma(P)\}.$$

Similarly we can show that if $P_1$ has fewer than $k$ operation symbols
then $f(P_1)^i(x) = \bigwedge \{f_p(x) \mid p \in \sigma(P_1)^i\}$ for any $i \ge 0$.

Suppose $P = P_1^*$. Then

$$f(P)(x) = f(P_1)^*(x) = \bigwedge \{f(P_1)^i(x) \mid i \ge 0\} = \bigwedge \Bigl\{ \bigwedge \{f_p(x) \mid p \in \sigma(P_1)^i\} \Bigm| i \ge 0 \Bigr\} = \bigwedge \{f_p(x) \mid p \in \sigma(P_1^*)\}. \quad \Box$$


Theorem 5. For any vertex $v$, let $P(s,v)$ be a path expression
representing all paths from $s$ to $v$. Then mop$(v) = f(P(s,v))(\bot)$.

Thus we can use a solution to the single-source path expression
problem to solve continuous data flow problems. For examples and extensive
discussions of such problems see Cousot and Cousot [5], Fong, Kam, and
Ullman [9], Graham and Wegman [13], Kam and Ullman [16,17], Kildall [19],
and Rosen [23].


6. Monotone Data Flow Problems.

Many important global flow problems are not continuous [17]. For
such problems there is in general no algorithm to compute the meet over
all paths solution [17], and we must be satisfied with less information
than the MOP solution provides. In such situations the following approach
is appropriate.

A monotone data flow framework $(L,F)$ is a complete lower semi-lattice
$L$ with meet operation $\wedge$ and a set of functions $F: L \to L$
satisfying the following axioms:

(18a) (identity) $F$ contains the identity function $\iota$.

(18b) (closure) $F$ is closed under meet and function composition.

(18c) (monotonicity) For every $f \in F$ and $x, y \in L$, $x \le y$ implies
$f(x) \le f(y)$.

(18d) (approximation to $f^*$) For every function $f \in F$, there is
a function $f^{@} \in F$ such that

(i) $f^{@}(x) \le f^i(x)$ for all $x \in L$, $i \ge 0$; and

(ii) if $x, y \in L$ satisfy $f(x) \wedge y \ge x$, then $f^{@}(y) \ge x$.

Monotone frameworks generalize continuous frameworks by requiring only
monotonicity (18c) in place of continuity (16c) and by requiring only a
pseudo transitive closure function. Note that $f^*$ is the maximum
function satisfying (18d).

A monotone data flow problem consists of a flow graph $G = (V,E,s)$,
a monotone data flow framework $(L,F)$, and a mapping from $E$ to $F$
whose values we denote by $f_e$ for $e \in E$. A fixed point for this problem
is a mapping $z: V \to L$ such that

(19) $z(s) = \bot$ and $f_e(z(h(e))) \ge z(t(e))$ for any $e \in E$.

A safe solution to the data flow problem is a mapping $x: V \to L$ such that

(20a) $x(v) \le f_p(\bot)$ for any vertex $v$ and any path $p$ from $s$
to $v$; and

(20b) $x(v) \ge z(v)$ for any fixed point $z$ and any vertex $v$.

Thus a safe solution is a conservative approximation to the MOP solution which
is at least as informative as any fixed point. It is easy to prove that
any fixed point satisfies (20a); if the data flow problem is continuous,
the MOP solution is the maximum fixed point [19].

We can use a slight variant of the mapping defined in Section 5 to
compute a safe solution to a monotone data flow problem. Let $f$ be
defined as in (17), except $f(P_1^*) = f(P_1)^{@}$.

Lemma 5. Let $P \ne \emptyset$ be a path expression of type $(v,w)$. Then
$f(P)(x) \le f_p(x)$ for all $p \in \sigma(P)$ and $x \in L$.

Proof. By induction on the number of operation symbols in $P$. The
lemma is immediate if $P$ is atomic. Suppose the lemma is true for path
expressions containing fewer than $k$ operation symbols, and let $P$
contain $k$ operation symbols. We have three cases.

Suppose $P = P_1 \cup P_2$ and $p \in \sigma(P)$. If $p \in \sigma(P_1)$ then
$f(P)(x) = f(P_1)(x) \wedge f(P_2)(x) \le f(P_1)(x) \le f_p(x)$ by the induction hypothesis;
similarly if $p \in \sigma(P_2)$.

Suppose $P = P_1 \cdot P_2$ and $p = p_1 p_2$ with $p_1 \in \sigma(P_1)$, $p_2 \in \sigma(P_2)$. Then
$f(P)(x) = f(P_2)(f(P_1)(x)) \le f(P_2)(f_{p_1}(x)) \le (f_{p_2} \circ f_{p_1})(x) = f_p(x)$
by monotonicity and the induction hypothesis.

Suppose $P = P_1^*$ and $p = p_1 p_2 \cdots p_k$ with $p_i \in \sigma(P_1)$ for $1 \le i \le k$. Then
$f(P)(x) = f(P_1)^{@}(x) \le f(P_1)^k(x)$ by (18d)(i), and
$f(P_1)^k(x) \le f_p(x)$ by monotonicity and the induction
hypothesis, as above. □


Lemma 6. Let $P \ne \emptyset$ be a path expression of type $(v,w)$. If $z$ is
any fixed point, then $f(P)(z(v)) \ge z(w)$.

Proof. By induction. The lemma is immediate if $P$ is atomic. Suppose
the lemma is true for path expressions containing fewer than $k$ operation
symbols, and let $P$ contain $k$ operation symbols. We have the usual
three cases.

Suppose $P = P_1 \cup P_2$. Then $f(P)(z(v)) = f(P_1)(z(v)) \wedge f(P_2)(z(v))
\ge z(w)$ by the induction hypothesis.

Suppose $P = P_1 \cdot P_2$. Let $u$ be the vertex such that $P_1$ is of
type $(v,u)$ and $P_2$ is of type $(u,w)$. Then $f(P)(z(v)) =
f(P_2)(f(P_1)(z(v))) \ge f(P_2)(z(u)) \ge z(w)$ by monotonicity and the induction hypothesis.

Suppose $P = P_1^*$. By the induction hypothesis, $f(P_1)(z(v)) \wedge z(v)
\ge z(v)$. By (18d)(ii), $f(P)(z(v)) = f(P_1)^{@}(z(v)) \ge z(v)$. □
Theorem 6. For each vertex $v$, let $P(s,v)$ be a path expression
representing all paths from $s$ to $v$. Then the function $x: V \to L$
defined by $x(v) = f(P(s,v))(\bot)$ is a safe solution.

Proof. By Lemma 5, $x(v) = f(P(s,v))(\bot) \le f_p(\bot)$ for all $p \in \sigma(P(s,v))$;
thus $x$ satisfies (20a). Let $z$ be any fixed point. By Lemma 6,
$x(v) = f(P(s,v))(\bot) = f(P(s,v))(z(s)) \ge z(v)$; thus $x$ satisfies (20b). □


7. Bounded Data Flow Problems.

Most interesting data flow problems satisfy a stronger condition on $L$
than completeness, called the descending chain condition: every descending
chain $x_1 > x_2 > x_3 > \cdots$ in $L$ is finite. For semi-lattices satisfying
the descending chain condition, continuity is equivalent to distributivity:
$f(x \wedge y) = f(x) \wedge f(y)$ for all $f \in F$ and $x, y \in L$. Our continuous data
flow problems are thus a generalization of the distributive data flow
problems considered by Kildall [19]. Although most global flow problems
satisfy the descending chain condition, some, such as type checking [33],
do not.

If the set of functions $F$ in a data flow framework satisfies a
boundedness condition, then we can compute an approximation $f^{@}$ to $f^*$
for any function $f \in F$ using only function meet and composition. If
the framework is continuous as well, it is possible to compute the MOP
solution from a set of path expressions representing only some of the
paths from the start vertex. We shall consider a hierarchy of boundedness
axioms. For $k \ge 1$, a k-bounded data flow framework $(L,F)$ is a
complete lower semi-lattice $L$ with meet operation $\wedge$ and a set of
functions $F: L \to L$ satisfying identity (18a), closure (18b),
monotonicity (18c), and

(21) (k-boundedness) $f^k(x) \ge \bigwedge \{f^i(x) \mid 0 \le i \le k-1\}$ for all $f \in F$ and $x \in L$.

For $k \ge 1$, a k-semi-bounded data flow framework $(L,F)$ is a complete
lower semi-lattice $L$ with meet operation $\wedge$ and a set of functions
$F: L \to L$ satisfying (18a), (18b), (18c), and

(22) (k-semi-boundedness) $f^k(x) \ge \bigl(\bigwedge \{f^i(x) \mid 0 \le i \le k-1\}\bigr) \wedge f^k(y)$
for all $f \in F$ and $x, y \in L$.
We define k-bounded and k-semi-bounded data flow problems in the
obvious way. It is easy to show that k-boundedness implies
k-semi-boundedness and k-semi-boundedness implies (k+1)-boundedness.
Boundedness, being a property of $F$ and not of $L$, is neither
stronger nor weaker than the descending chain condition. The k-bounded
and k-semi-bounded data flow problems include some, but not all, of the
global flow problems mentioned in the literature. Problems that use
bit vectors, such as finding available expressions [31] and finding
live variables [18], are 1-semi-bounded but not 1-bounded. Problems
that use "structured partition lattices", such as common subexpression
detection [9,16,19], are 2-bounded but not 1-semi-bounded. Type checking
[33] is not k-bounded unless some bound is artificially imposed.


Lemma 7. In a k-bounded data flow framework $(L,F)$,
$f^* = \bigwedge \{f^i \mid 0 \le i \le k-1\}$ for all $f \in F$.

Proof. We prove by induction on $j$ that if $j \ge k$,
$f^j(x) \ge \bigwedge \{f^i(x) \mid 0 \le i \le k-1\}$ for all $f \in F$ and $x \in L$. The claim
is true for $j = k$ by k-boundedness. Suppose $j > k$ and the claim
is true for $j-1$. Then

$$f^j(x) = f^{j-1}(f(x)) \ge \bigwedge \{f^i(x) \mid 1 \le i \le k\} \quad \text{by the induction hypothesis}$$

$$\ge \bigwedge \{f^i(x) \mid 0 \le i \le k-1\} \quad \text{by k-boundedness.}$$

The lemma follows from the claim. □


Lemma 8. In a k-bounded data flow framework $(L, F)$, the function $f^@$
defined by $f^@ = (f \wedge \iota)^{k-1}$ for $f \in F$ satisfies (18d).

Proof. By repeated use of monotonicity, we obtain
$f^@(x) = (f \wedge \iota)^{k-1}(x) \le \bigwedge \{ f^i(x) \mid 0 \le i \le k-1 \}$, which implies (18d)(i)
by Lemma 7. We prove by induction on $j$ that if $f(x) \wedge y \ge x$,
then $(f \wedge \iota)^j(y) \ge x$. The result is immediate for $j = 0$. Suppose
$(f \wedge \iota)^{j-1}(y) \ge x$. Then $(f \wedge \iota)^j(y) \ge f(x) \wedge x \ge x$. Thus
$f(x) \wedge y \ge x$ implies $f^@(y) = (f \wedge \iota)^{k-1}(y) \ge x$, and (18d)(ii) holds. □


If $(L, F)$ is a k-bounded data flow framework and $f \in F$, we can
compute $f^*$ using $O(k)$ function meets and compositions by Lemma 7.
We can compute an approximation $f^@$ to $f^*$ in $O(\log k)$ function meets and
compositions by Lemma 8. (We trade accuracy for time if we compute $f^@$
instead of $f^*$.) Theorem 6 thus gives a method to solve bounded
data flow problems using only function meet, composition, and application.
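The two operation counts can be made concrete with a small sketch. The Python code below is an illustration under stated assumptions, not part of the original development: the lattice is the finite chain $\{0, \ldots, n-1\}$ with minimum as meet, a function is stored as a tuple of its values, and the example function is 3-bounded on this chain. The names `star_by_lemma7` and `at_by_squaring` are ours. Because every monotone function on a chain distributes over minimum, the two results coincide here; in a framework that is not continuous, the second may be strictly smaller.

```python
def identity(n):
    """The identity function on the chain {0, ..., n-1}, as a value table."""
    return tuple(range(n))

def meet(f, g):
    """Pointwise meet of two functions; on a chain the meet is the minimum."""
    return tuple(min(a, b) for a, b in zip(f, g))

def compose(f, g):
    """(f o g)(x) = f(g(x))."""
    return tuple(f[y] for y in g)

def star_by_lemma7(f, k):
    """Meet of f^0, ..., f^(k-1), using O(k) meets and compositions."""
    power = identity(len(f))
    result = power
    for _ in range(k - 1):
        power = compose(f, power)
        result = meet(result, power)
    return result

def at_by_squaring(f, k):
    """(f meet identity)^(k-1) by repeated squaring: O(log k) compositions."""
    base = meet(f, identity(len(f)))
    result = identity(len(f))
    exponent = k - 1
    while exponent > 0:
        if exponent & 1:
            result = compose(base, result)
        base = compose(base, base)
        exponent >>= 1
    return result

f = (0, 0, 1, 3, 3, 4)             # a monotone, 3-bounded function on {0, ..., 5}
print(star_by_lemma7(f, 3))        # (0, 0, 0, 3, 3, 3)
print(at_by_squaring(f, 3))        # (0, 0, 0, 3, 3, 3)
```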

Suppose $(L, F, G, f_E)$ is a data flow problem which is not only
bounded but continuous. In this case $f^@ = f^*$, and we can compute
the MOP solution using only function meet, composition, and application,
with $O(\log k)$ such operations replacing each $*$. We can also use
path expressions representing only some of the paths from $s$, as
demonstrated by the next results.


Lemma 9. Let $(L, F, G, f_E)$ be a k-bounded continuous data flow problem.
Let $v$ be a vertex in $G$ and let $p$ be a path from $s$ to $v$ that
is not k-simple. Then there is a set $S$ of paths from $s$ to $v$ such
that each path in $S$ is shorter than $p$ and $f_p \ge \bigwedge \{ f_q \mid q \in S \}$.

Proof. If $p$ is not k-simple, then $p$ contains some vertex $u$ at
least $k+1$ times. Let $p = p_0 p_1 p_2 \cdots p_k p_{k+1}$, where each $p_i$ for
$1 \le i \le k$ is a cycle from $u$ to $u$. (Both $p_0$ and $p_{k+1}$ may be
the empty path.) Then

$$
\begin{aligned}
f_p &\ge f_{p_{k+1}} \circ \Bigl( \bigwedge \{ f_{p_i} \mid 1 \le i \le k \} \Bigr)^k \circ f_{p_0} && \text{by continuity} \\
&\ge f_{p_{k+1}} \circ \bigwedge \Bigl\{ \Bigl( \bigwedge \{ f_{p_i} \mid 1 \le i \le k \} \Bigr)^j \Bigm| 0 \le j \le k-1 \Bigr\} \circ f_{p_0} && \text{by k-boundedness} \\
&\ge \bigwedge \{ f_q \mid q = p_0 q_1 q_2 \cdots q_l p_{k+1} \text{ where } 0 \le l \le k-1 \\
&\qquad\quad \text{and } q_j \in \{ p_i \mid 1 \le i \le k \} \text{ for } 1 \le j \le l \}.
\end{aligned}
$$

□

Corollary 1. Let $(L, F, G, f_E)$ be a k-bounded continuous data flow problem.
Let $v$ be a vertex in $G$ and let $p$ be a path from $s$ to $v$. Then
$f_p \ge \bigwedge \{ f_q \mid q \text{ is a k-simple path from } s \text{ to } v \}$.

Proof. By induction on the length of $p$ using Lemma 9. □

Theorem 7. Let $(L, F, G, f_E)$ be a k-bounded continuous data flow problem.
For each vertex $v$, let $P_k(s,v)$ be a path expression such that
$\sigma(P_k(s,v))$ contains at least all the k-simple paths from $s$ to $v$.
Then $\mathrm{mop}(v) = f(P_k(s,v))(\bot)$, where $f$ is defined as in Section 5.

Proof. Immediate from Lemma ^ and Corollary 1. □

Lemma 10. Let $(L, F, G, f_E)$ be a k-semi-bounded continuous data flow problem.
Let $v$ be a vertex in $G$ and let $p$ be a path from $s$ to $v$ which
is not k-semi-simple. Then there is a set $S$ of paths from $s$ to $v$
such that each path in $S$ is shorter than $p$ and $f_p \ge \bigwedge \{ f_q \mid q \in S \}$.

Proof. If $p$ is not k-semi-simple, then $p$ can be partitioned into
$p = p_0 p_1 p_2 p_3 \cdots p_{k+2} p_{k+3}$, where $p_1$ and $p_i$ for $3 \le i \le k+2$ are
cycles, and $p_0$, $p_2$, $p_{k+3}$ are possibly empty. Then

$$
\begin{aligned}
f_p &\ge f_{p_{k+3}} \circ \Bigl( \bigwedge \{ f_{p_i} \mid 3 \le i \le k+2 \} \Bigr)^k \circ f_{p_0 p_1 p_2} && \text{by continuity} \\
&\ge \Bigl( f_{p_{k+3}} \circ \bigwedge \Bigl\{ \Bigl( \bigwedge \{ f_{p_i} \mid 3 \le i \le k+2 \} \Bigr)^j \Bigm| 0 \le j \le k-1 \Bigr\} \circ f_{p_0 p_1 p_2} \Bigr) \\
&\qquad \wedge \Bigl( f_{p_{k+3}} \circ \Bigl( \bigwedge \{ f_{p_i} \mid 3 \le i \le k+2 \} \Bigr)^k \circ f_{p_0 p_2} \Bigr) && \text{by k-semi-boundedness and continuity} \\
&\ge \Bigl( \bigwedge \{ f_q \mid q = p_0 p_1 p_2 q_1 q_2 \cdots q_l p_{k+3} \text{ where } 0 \le l \le k-1 \\
&\qquad\quad \text{and } q_j \in \{ p_i \mid 3 \le i \le k+2 \} \text{ for } 1 \le j \le l \} \Bigr) \\
&\qquad \wedge \Bigl( \bigwedge \{ f_q \mid q = p_0 p_2 q_1 q_2 \cdots q_k p_{k+3} \text{ where } q_j \in \{ p_i \mid 3 \le i \le k+2 \} \\
&\qquad\quad \text{for } 1 \le j \le k \} \Bigr).
\end{aligned}
$$

□

Corollary 2. Let $(L, F, G, f_E)$ be a k-semi-bounded continuous data flow
problem. Let $v$ be a vertex in $G$ and let $p$ be a path from $s$ to $v$.
Then $f_p \ge \bigwedge \{ f_q \mid q \text{ is a k-semi-simple path from } s \text{ to } v \}$.

Proof.  By  induction  on  the  length  of  p using  Lemma  10.  □ 

Theorem 8. Let $(L, F, G, f_E)$ be a k-semi-bounded continuous data flow
problem. For each vertex $v$, let $P'_k(s,v)$ be a path expression such
that $\sigma(P'_k(s,v))$ contains at least all the k-semi-simple paths from
$s$ to $v$. Then $\mathrm{mop}(v) = f(P'_k(s,v))(\bot)$, where $f$ is defined as in
Section 5.

Proof. Immediate from Lemma ^ and Corollary 2. □


Corollaries  1 and  2 require  continuity;  in  fact,  the  MOP  solution 
is  not  effectively  computable  in  a general  2-bounded  monotone  data
flow  problem  [17].  See  Kam  and  Ullman  [16]  and  Tarjan  [29]  for  further 
discussion  of  the  effect  of  boundedness  on  global  flow  analysis. 


8.  An  Idiosyncratic  Data  Flow  Problem. 




As  a final  application  of  our  technique,  we  shall  consider  a data 
flow  problem  that  does  not  fit  naturally  into  the  semi-lattice 
framework,  but  that  can  still  be  solved  easily  using  a mapping  from  path 
expressions.  The  problem  arises  in  the  optimization  of  very-high-level 
languages  and  has  been  studied  by  Fong  [8]. 

Let $G = (V, E, s)$ be the flow graph of a program which contains
occurrences of an expression $\mathcal{E}$. With each edge $e$ of the program
is associated an effect, which has one of four values depending upon
what flow of control through edge $e$ does to the value of $\mathcal{E}$:

$$
\mathrm{effect}(e) =
\begin{cases}
\mathit{gen} & \text{if the program recomputes } \mathcal{E} \\
\mathit{kill} & \text{if the program makes a large change in the value of } \mathcal{E} \\
\mathit{injure} & \text{if the program makes a small change in the value of } \mathcal{E} \\
\mathit{trans} & \text{if the program does not affect the current value of } \mathcal{E}
\end{cases}
$$

For any vertex $v$, we say $\mathcal{E}$ is implicitly available on entry to $v$
if there is a positive bound $b$ such that, for every path
$p = e_1, e_2, \ldots, e_k$ from $s$ to $v$, there is an $i$ such that
(i) $\mathrm{effect}(e_i) = \mathit{gen}$, (ii) $\mathrm{effect}(e_j) \ne \mathit{kill}$ for $i < j \le k$,
and (iii) the number of values $j$ such that $i < j \le k$ and
$\mathrm{effect}(e_j) = \mathit{injure}$ is bounded by $b$. Note that the bound $b$ can
depend upon the vertex $v$ but not upon the path $p$.

The problem we wish to solve is to determine from $\{ \mathrm{effect}(e) \mid e \in E \}$
the vertices at which $\mathcal{E}$ is implicitly available. The idea is that if
the most-recently-computed value of $\mathcal{E}$ can be injured only a bounded
number of times before entering $v$, we can compute the value on entry
to $v$ from the most-recently-computed value by performing a bounded
number of updates. Otherwise, we must completely recompute $\mathcal{E}$ to obtain
its value on entry to $v$.

Fong  [8]  claims  that  this  problem  cannot  be  formulated  within  the 
semi-lattice  framework,  "at  least  in  the  only  natural  choice  of  semi- 
lattice." However,  Fong  observes  that  the  problem  can  still  be  solved 
efficiently.  We  shall  define  a mapping  from  path  expressions  for  this 
purpose. 

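Let $D = \{ g, t_0, t_+, \omega \}$ be a set having operations $\wedge$, $\circ$, $@$ defined
by the following tables. (In each binary table the row gives the left
operand and the column gives the right operand.)

$$
\begin{array}{c|cccc}
\wedge & g & t_0 & t_+ & \omega \\ \hline
g & g & t_0 & t_+ & \omega \\
t_0 & t_0 & t_0 & t_+ & \omega \\
t_+ & t_+ & t_+ & t_+ & \omega \\
\omega & \omega & \omega & \omega & \omega
\end{array}
\qquad
\begin{array}{c|cccc}
\circ & g & t_0 & t_+ & \omega \\ \hline
g & g & g & g & \omega \\
t_0 & g & t_0 & t_+ & \omega \\
t_+ & g & t_+ & t_+ & \omega \\
\omega & g & \omega & \omega & \omega
\end{array}
\qquad
\begin{array}{c|c}
x & x^@ \\ \hline
g & t_0 \\
t_0 & t_0 \\
t_+ & \omega \\
\omega & \omega
\end{array}
$$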

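Let the mapping $f$ from path expressions to $D$ be defined as follows.

(21a) $f(\Lambda) = t_0$;

$$
f(e) =
\begin{cases}
g & \text{if } \mathrm{effect}(e) = \mathit{gen} \\
\omega & \text{if } \mathrm{effect}(e) = \mathit{kill} \\
t_+ & \text{if } \mathrm{effect}(e) = \mathit{injure} \\
t_0 & \text{if } \mathrm{effect}(e) = \mathit{trans}
\end{cases}
\qquad \text{for } e \in E.
$$

(21b) $f(P_1 \cup P_2) = f(P_1) \wedge f(P_2)$;

$f(P_1 \cdot P_2) = f(P_1) \circ f(P_2)$;

$f(P_1^*) = f(P_1)^@$.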
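The mapping can be evaluated by a direct recursion on the structure of the path expression. The Python sketch below is one possible rendering, given as an illustration: the tuple encoding of path expressions (`"empty"`, `"edge"`, `"union"`, `"cat"`, `"star"`), the string `"w"` standing for $\omega$, and all identifiers are assumptions of ours. The table entries follow our reading of the tables above, with rows as left operands, and $f(\Lambda)$ is taken to be $t_0$ because the empty path contains no injuries.

```python
ELEMS = ("g", "t0", "t+", "w")      # "w" stands for omega

# Operation tables: MEET[x][j] is x meet ELEMS[j]; COMPOSE[x][j] is x o ELEMS[j].
MEET = {
    "g":  ("g",  "t0", "t+", "w"),
    "t0": ("t0", "t0", "t+", "w"),
    "t+": ("t+", "t+", "t+", "w"),
    "w":  ("w",  "w",  "w",  "w"),
}
COMPOSE = {
    "g":  ("g", "g",  "g",  "w"),
    "t0": ("g", "t0", "t+", "w"),
    "t+": ("g", "t+", "t+", "w"),
    "w":  ("g", "w",  "w",  "w"),
}
STAR = {"g": "t0", "t0": "t0", "t+": "w", "w": "w"}
EDGE = {"gen": "g", "kill": "w", "injure": "t+", "trans": "t0"}

def f(expr):
    """Evaluate a path expression, given as nested tuples, to an element of D."""
    kind = expr[0]
    if kind == "empty":
        return "t0"
    if kind == "edge":                       # ("edge", effect of that edge)
        return EDGE[expr[1]]
    if kind == "union":
        return MEET[f(expr[1])][ELEMS.index(f(expr[2]))]
    if kind == "cat":
        return COMPOSE[f(expr[1])][ELEMS.index(f(expr[2]))]
    if kind == "star":
        return STAR[f(expr[1])]
    raise ValueError(f"unknown node kind: {kind}")

gen, kill = ("edge", "gen"), ("edge", "kill")
injure, trans = ("edge", "injure"), ("edge", "trans")

print(f(("cat", gen, ("star", trans))))                  # g
print(f(("cat", gen, ("star", injure))))                 # w
print(f(("cat", gen, ("star", ("cat", injure, gen)))))   # g
```

In the second example the loop can injure the value an unbounded number of times, so the result is $\omega$; in the third, every trip around the loop ends by recomputing the expression, so the result is $g$.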

We call a path $p = e_1, e_2, \ldots, e_k$ in $G$ a $t_i$-path if
$\mathrm{effect}(e_j) \in \{ \mathit{injure}, \mathit{trans} \}$ for $1 \le j \le k$ and the number of edges $e_j$
such that $\mathrm{effect}(e_j) = \mathit{injure}$ is $i$. We call a path $p$ a $g_i$-path
if it can be partitioned into $p = p_1, e, p_2$, where $\mathrm{effect}(e) = \mathit{gen}$
and $p_2$ is a $t_i$-path. We call a path $p$ an $\omega$-path if it can be
partitioned into $p = p_1, e, p_2$, where $\mathrm{effect}(e) = \mathit{kill}$ and $p_2$ is a
$t_i$-path for some $i$.
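Every path falls into exactly one of these classes, determined by the last edge whose effect is gen or kill. As an illustration, the following Python sketch classifies a path given as the list of its edge effects; the function name `classify` and the tuple results are our own conventions, and `"w"` stands for $\omega$.

```python
def classify(effects):
    """Classify a path, given as a list of edge effects in path order.

    Returns ("t", i) for a t_i-path, ("g", i) for a g_i-path,
    and ("w", None) for an omega-path.
    """
    injuries = 0
    # Scan backwards until the last gen or kill edge is found.
    for effect in reversed(effects):
        if effect == "gen":
            return ("g", injuries)
        if effect == "kill":
            return ("w", None)
        if effect == "injure":
            injuries += 1
        elif effect != "trans":
            raise ValueError(f"unknown effect: {effect}")
    return ("t", injuries)

print(classify(["kill", "gen", "injure", "trans", "injure"]))   # ('g', 2)
print(classify(["gen", "kill", "trans"]))                       # ('w', None)
print(classify(["injure", "trans"]))                            # ('t', 1)
```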

Lemma 11. Let $P$ be a path expression. Then

(i) $f(P) = g$ if there is a bound $b$ such that every path in $\sigma(P)$
is a $g_i$-path with $i \le b$;

(ii) $f(P) = t_0$ if there is a bound $b$ such that every path in $\sigma(P)$
is either a $g_i$-path with $i \le b$ or a $t_0$-path, and $\sigma(P)$
contains at least one $t_0$-path;

(iii) $f(P) = t_+$ if there is a bound $b$ such that every path in $\sigma(P)$
is either a $g_i$-path with $i \le b$ or a $t_i$-path with $i \le b$,
and $\sigma(P)$ contains at least one $t_i$-path with $i > 0$;

(iv) $f(P) = \omega$ in all other cases. (For any bound $b$, $\sigma(P)$ contains
either a $g_i$-path with $i > b$, a $t_i$-path with $i > b$, or
an $\omega$-path.)

Proof. Straightforward but tedious, by induction on the number of
operation symbols in $P$. □

Theorem 9. For each vertex $v$ in $G$, let $P(s,v)$ be a path expression
representing all paths from $s$ to $v$ in $G$. Then $\mathcal{E}$ is implicitly
available at $v$ if and only if $f(P(s,v)) = g$.

Proof. Immediate from Lemma 11. □

Actual  occurrences  of  the  implicit  availability  problem  usually 
involve  a number  of  expressions.  We  can  perform  the  computation 
associated  with  Theorem  9 in  parallel  for  all  the  expressions  by  using 
bit  vector  operations.  Since  D contains  four  elements,  we  need  two 
bit  vectors  for  each  value  computed  (rather  than  the  three  proposed  by 
Fong  [8]).  By  adding  an  additional  element  to  D we  can  compute  the 
explicitly  available  expressions  (those  available  with  no  injuries)  in 
addition  to  the  implicitly  available  ones. 
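One way to realize the two-bit-vector representation is sketched below in Python. The encoding is an assumption of ours and is not taken from the text or from Fong [8]: each element of $D$ is given a two-bit rank along the chain $\omega < t_+ < t_0 < g$, the high bits of all expressions are packed into one integer and the low bits into another, and the three operations become a handful of bitwise instructions. The scalar functions `meet`, `seq` and `star` restate the operation tables and serve only as a reference for checking.

```python
NAMES = ["w", "t+", "t0", "g"]            # rank 0..3 along the chain; "w" is omega
RANK = {name: r for r, name in enumerate(NAMES)}

def pack(values):
    """Pack one D-value per expression into two bit vectors (hi, lo)."""
    hi = lo = 0
    for lane, v in enumerate(values):
        hi |= (RANK[v] >> 1) << lane
        lo |= (RANK[v] & 1) << lane
    return hi, lo

def unpack(vec, n):
    hi, lo = vec
    return [NAMES[2 * ((hi >> i) & 1) + ((lo >> i) & 1)] for i in range(n)]

def meet_bits(x, y):
    """Lane-wise meet: the minimum of the two 2-bit ranks in every lane."""
    (xh, xl), (yh, yl) = x, y
    lo = (~(xh ^ yh) & xl & yl) | (xh & ~yh & yl) | (~xh & yh & xl)
    return xh & yh, lo

def seq_bits(x, y):
    """Lane-wise x o y (x is the value of the prefix, y of the suffix)."""
    (xh, xl), (yh, yl) = x, y
    y_g, y_t0, y_tp = yh & yl, yh & ~yl, ~yh & yl
    hi = y_g | (y_t0 & xh) | (y_tp & xh & xl)
    lo = y_g | (y_t0 & xl) | (y_tp & (xh | xl))
    return hi, lo

def star_bits(x):
    """Lane-wise x@: g and t0 map to t0, while t+ and omega map to omega."""
    return x[0], 0

# Scalar reference operations, restating the tables.
def meet(x, y):
    return NAMES[min(RANK[x], RANK[y])]

def seq(x, y):
    if y in ("g", "w"):
        return y
    if y == "t0":
        return x
    return "t+" if x == "t0" else x

def star(x):
    return "t0" if x in ("g", "t0") else "w"

# Check all 16 operand pairs at once, one pair per lane.
xs = [a for a in NAMES for b in NAMES]
ys = [b for a in NAMES for b in NAMES]
X, Y = pack(xs), pack(ys)
print(unpack(meet_bits(X, Y), 16) == [meet(a, b) for a, b in zip(xs, ys)])
print(unpack(seq_bits(X, Y), 16) == [seq(a, b) for a, b in zip(xs, ys)])
print(unpack(star_bits(X), 16) == [star(a) for a in xs])
```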


9.  Remarks.

We have shown how to use path expressions to solve three kinds of
path problems on directed graphs. Our results allow us to build a
general algorithm for solving path problems on directed graphs; to solve
a particular path problem, we merely interpret $\cup$, $\cdot$, and $*$
appropriately. We can base such an algorithm on Gaussian or Gauss-Jordan
elimination [21]. Tarjan [30] discusses another algorithm, which is
especially efficient on reducible and almost-reducible graphs [15,28].
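The idea of reinterpreting the three operators can be sketched as a single fold over the path expression, parameterized by the interpretation. The Python code below is an illustration only: the tuple encoding of path expressions and every identifier are assumptions of ours. It applies the same evaluator twice, once with the usual shortest-path reading (union as minimum, concatenation as addition, star as zero when the cycle cost is nonnegative) and once with the implicit availability algebra of Section 8, where `"w"` stands for $\omega$.

```python
import math

def interpret(expr, edge, empty, union, concat, star):
    """Fold a path expression bottom-up under one interpretation of its operators."""
    def go(e):
        kind = e[0]
        if kind == "empty":
            return empty
        if kind == "edge":
            return edge(e[1])
        if kind == "star":
            return star(go(e[1]))
        if kind == "union":
            return union(go(e[1]), go(e[2]))
        if kind == "cat":
            return concat(go(e[1]), go(e[2]))
        raise ValueError(f"unknown node kind: {kind}")
    return go(expr)

def shortest(expr, cost):
    """Shortest-path reading: (min, +), with star = 0 unless a cycle is negative."""
    return interpret(expr, edge=cost.__getitem__, empty=0, union=min,
                     concat=lambda x, y: x + y,
                     star=lambda x: 0 if x >= 0 else -math.inf)

RANK = {"w": 0, "t+": 1, "t0": 2, "g": 3}
VALUE = {"gen": "g", "kill": "w", "injure": "t+", "trans": "t0"}

def d_seq(x, y):
    """The composition table of Section 8, written as a case analysis on y."""
    if y in ("g", "w"):
        return y
    if y == "t0":
        return x
    return "t+" if x == "t0" else x

def availability(expr, effect):
    """Implicit availability reading over D = {g, t0, t+, w}."""
    return interpret(expr, edge=lambda name: VALUE[effect[name]], empty="t0",
                     union=lambda x, y: x if RANK[x] <= RANK[y] else y,
                     concat=d_seq,
                     star=lambda x: "t0" if x in ("g", "t0") else "w")

def edge_of(name):
    return ("edge", name)

# The path expression  a . (b . c)* . d  U  e  over hypothetical edges a..e.
loop = ("star", ("cat", edge_of("b"), edge_of("c")))
expr = ("union", ("cat", ("cat", edge_of("a"), loop), edge_of("d")), edge_of("e"))

print(shortest(expr, {"a": 1, "b": 2, "c": 3, "d": 4, "e": 7}))   # 5
print(availability(expr, {"a": "gen", "b": "trans", "c": "injure",
                          "d": "trans", "e": "gen"}))              # w
```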

Our  results  serve  to  formally  justify  the  empirical  observation 
that  the  same  algorithms  work  on  many  different  path  problems.  There 
are  of  course  algorithms  that  solve  only  a particular  kind 
of  path  problem,  such  as  Dijkstra's  [6]  and  Fredman's  [12]  shortest 
path  algorithms  and  Pan's  improvement  to  Strassen's  algorithm  for  solving 
linear  equations  [4,22,26].  However,  any  algorithm  able  to  compute  path
expressions  also  solves  all  the  path  problems  we  have  considered  here. 

Our ideas extend easily to matrix multiplication problems and to
problems requiring the transitive closure of a matrix. See Aho, Hopcroft,
and Ullman [1] and Lehman [21] for discussions of such problems.

Appendix:  Graph-Theoretic  Definitions


A directed graph $G = (V, E)$ is a finite set $V$ of vertices and a
finite set $E$ of edges such that each edge $e$ has a head $h(e) \in V$ and
a tail $t(e) \in V$. We regard the edge $e$ as leading from $h(e)$ to $t(e)$.
A path $p = e_1, e_2, \ldots, e_k$ is a sequence of edges such that $t(e_i) = h(e_{i+1})$
for $1 \le i \le k-1$. The path is from $h(e_1)$ to $t(e_k)$. The path contains
edges $e_1, e_2, \ldots, e_k$ and vertices $h(e_1), h(e_2), \ldots, h(e_k), t(e_k)$, and
avoids all other edges and vertices. There is a path of no edges from
any vertex to itself. A cycle is a non-empty path from a vertex to
itself.

If there is a path from a vertex $v$ to a vertex $w$, then $w$ is
reachable from $v$. A flow graph $G = (V, E, s)$ is a graph containing
a distinguished start vertex $s$ such that every vertex is reachable
from $s$.


A simple path $p$ is a path containing no vertex twice. For $k \ge 1$,
a k-simple path is a path containing no vertex $k+1$ times. Thus a
1-simple path is simple. A k-semi-simple path is a path $p$ that can
be partitioned as $p = p_1, e, p_2$, where $p_1$ is simple, $e$ is an edge,
and $p_2$ is k-simple.
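These definitions translate directly into executable predicates. In the Python sketch below, which is an illustration and not part of the original text, an edge is a pair `(h, t)` leading from `h` to `t`, following the convention above, and a path is a list of such pairs. The helper names and the `at` parameter, which records the vertex at which an empty path sits, are our own. Under a literal reading of the definition, a path with no edges has no partition $p_1, e, p_2$, so `is_k_semi_simple` returns `False` for it.

```python
from collections import Counter

def is_path(edges):
    """Consecutive edges must satisfy t(e_i) = h(e_{i+1})."""
    return all(edges[i][1] == edges[i + 1][0] for i in range(len(edges) - 1))

def vertices(edges, at=None):
    """Vertex sequence h(e_1), ..., h(e_k), t(e_k); an empty path sits at `at`."""
    if not edges:
        return [at]
    return [h for h, _ in edges] + [edges[-1][1]]

def is_k_simple(edges, k, at=None):
    """No vertex may occur k+1 times on the path."""
    return max(Counter(vertices(edges, at)).values()) <= k

def is_k_semi_simple(edges, k):
    """Some split p = p1, e, p2 has p1 simple (1-simple) and p2 k-simple."""
    for j, (h, t) in enumerate(edges):
        if is_k_simple(edges[:j], 1, at=h) and is_k_simple(edges[j + 1:], k, at=t):
            return True
    return False

p = [("a", "b"), ("b", "a"), ("a", "b"), ("b", "c")]   # visits a and b twice
print(is_path(p))                 # True
print(is_k_simple(p, 1))          # False
print(is_k_simple(p, 2))          # True
print(is_k_semi_simple(p, 1))     # True
```

The last line shows that a path which is not simple can still be 1-semi-simple: splitting after the first edge leaves the simple prefix `a, b` and the simple suffix `a, b, c`.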